soimort/you-get

rg3/youtube-dl

A collection of Python crawler projects

Awesome-crawler-cn

Python-crawler

woaidu_crawler

crawler_html2pdf

sam408130/crawler

Python article series

Scrapes the summary card and list sections of Baidu Baike entries

Multithreaded Python crawler for scraping resources from Dianying Tiantang (电影天堂)

LinkedIn-Crawler

SparrowG/LinkedIn-Crawler

NyCrawler

Yandere-crawler

infoqCrawler

Xiachufang (下厨房) crawler

zhihu-crawler

Scrapy-Redis

Shenjianshou (神箭手) cloud crawler: sample crawl rules

ovwane/python_crawler

README.md
Crawler
A collection of crawler code written by experts around the web.

spider

chenqing/spider

imchenkun/ick-spider

Otakuwizard/spider

Germey/Weibo

Scrapy

SpiderKeeper

scrapy/scrapyd

Germey/Gerapy

sam408130/scrapy

Wooden-Robot/scrapy-tutorial

LiuXingMing/Scrapy_Redis_Bloomfilter

A study of the scrapy-redis source code

#Proxies
ProxyCrawler

awolfly9/IPProxyTool

Nyloner/ProxyPool

Germey/ProxyPool

Germey/CookiesPool

love3forever/proxyHunter

pujinxiao/IPProxyPool

qiyeboy/IPProxyPool

Germey/AutoProxy

#1

istresearch/scrapy-cluster

wangqifan/ZhiHu

otakurice/weibonlp

geekcompany/DeerResume

otakurice/notravellist

Nyloner/NyPython

otakurice/lagoujob

#Simple crawlers
xiaozhiqi2016/spider_basic

Germey/TaobaoProduct

Germey/Cnki

Germey/TaobaoMM

Wooden-Robot/Pythonspider

bigstupidx/Pythonspider

StephinChou/Pythonspider

Germey/MeiKong

#Crawler resource roundups
KDF5000/SpiderRef

Ehco1996/Python-crawler

luyishisi/Nyspider

Nyloner/Nyspider

#Single-site crawlers
fankcoder/spider-comments

wwj718/jobSpider

zaxlct/baike-spider

hk029/LagouSpider

Jack-Cherish/python-spider

wanglu119/distributed-spider

pujinxiao/Lagou_spider

Wooden-Robot/spider-practice

sam408130/base_spider

LJ147/githubSpider

awesome-spider

LiuXingMing/SinaSpider

pujinxiao/sina_spider

xiaozhiqi2016/spider_advanced

LiuXingMing/LinkedinSpider

pujinxiao/zhihu_spider

KDF5000/RuolinSpider

hk029/NovelSpider

pujinxiao/jobbole_spider

Germey/TouTiao

Germey/Zhihu

Germey/Weixin

Germey/MaoYan

hk029/DoubanMovieSpider

hk029/BaiduTiebaSpider

Germey/TaobaoComments

Germey/iaskspider

tpeng/weibosearch

piglei/nowater

l-passer/Passer-zhihu

rg3/youtube-dl

gnemoug/sina_reptile

sam408130/parse-baidu-baike

piglei/tieba_poster

darkhandz/BaiduLoginWithTiebaSignin

ovwane/jdlingyu

pujinxiao/weixin

pujinxiao/crops_pider

pujinxiao/TaoBao

pujinxiao/wechat_sogou_crawl

jaryee/wechat_sogou_crawl

pujinxiao/qiantuwang

fankcoder/findtrip

echopy/QQSpider

thuxugang/QQSpider

echopy/Maizi

echopy/Baidu_Registration

dongweiming/weapp-zhihulive

LiuXingMing/Tmall1212

LiuXingMing/QQSpider

Germey/TaobaoUser

BruceDone/cnbeta

hk029/Pickup

hk029/doubanbook

DormyMo/scrappy

webdriver

asonhan007/webdriver_guide

#PySpider crawlers

Germey/PySpiders

rmax/scrapy-redis

LJ147/ImageSpyder

#CAPTCHAs
Zhihu
iyaopinner/zheye

muchrooms/zheye

Germey/Python3

LJ147/github-trending

pujinxiao/zhilian

LiuXingMing/WeiboSliderCode

Germey/crack-geetest

dzhongyi/crack-geetest

DormyMo/ProxyCrawler

#Login
xchaoinfo/fuck-login

pujinxiao/douban_login

LiuXingMing/WeiboSliderCode

UA

hellysmile/fake-useragent

#Zhihu columns on crawlers

Writing Python crawlers from scratch

Python之美 (The Beauty of Python), by 董伟明 (Dong Weiming)
Building a search engine with a distributed Python crawler: a Scrapy implementation
pujinxiao/project_pjx

Qiye (七夜) security crawlers: 490 projects

#Data analysis

wwj718/Zhihu_bigdata

yoghurtjia/Zhihu_bigdata

Approaches to common big-data interview and written-test questions from the BAT companies (Baidu, Alibaba, Tencent), implemented in Python

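yoghurtjia/sortquery: There are 10 files of 1 GB each. Every line of every file holds a user query (generate them randomly yourself), and the same query may appear in more than one file. The task is to sort the queries by frequency.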
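
One common way to approach the query-sorting problem above is to hash-partition the queries into bucket files so that every copy of a query lands in the same bucket, count each bucket separately, and then merge the per-bucket results. The sketch below is only an illustration of that idea, not the code from the linked repository. The function names, the bucket count, and the use of CRC32 as the hash are all assumptions, and for brevity the sorted per-bucket counts are kept in memory rather than written back to disk.

```python
import heapq
import os
import tempfile
import zlib
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def partition_queries(paths: Iterable[str], bucket_dir: str, n_buckets: int) -> List[str]:
    """Route every query to a bucket file chosen by a stable hash of its text.

    All copies of the same query end up in the same bucket, so each bucket
    can later be counted on its own. (Hypothetical helper, not from the repo.)
    """
    bucket_paths = [os.path.join(bucket_dir, f"bucket_{i}.txt") for i in range(n_buckets)]
    buckets = [open(p, "w", encoding="utf-8") for p in bucket_paths]
    try:
        for path in paths:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    query = line.rstrip("\n")
                    if not query:
                        continue  # skip blank lines
                    idx = zlib.crc32(query.encode("utf-8")) % n_buckets
                    buckets[idx].write(query + "\n")
    finally:
        for bucket in buckets:
            bucket.close()
    return bucket_paths


def count_bucket(path: str) -> List[Tuple[int, str]]:
    """Count one bucket and return (count, query) pairs, most frequent first."""
    counts: Counter = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts[line.rstrip("\n")] += 1
    pairs = [(count, query) for query, count in counts.items()]
    # Descending by count; ties broken alphabetically so the order is stable.
    return sorted(pairs, key=lambda t: (-t[0], t[1]))


def sort_queries_by_frequency(paths: Iterable[str], n_buckets: int = 16) -> Iterator[Tuple[str, int]]:
    """Yield (query, count) pairs across all input files, most frequent first."""
    with tempfile.TemporaryDirectory() as bucket_dir:
        bucket_paths = partition_queries(paths, bucket_dir, n_buckets)
        # Assumption for brevity: the sorted runs fit in memory.
        runs = [count_bucket(p) for p in bucket_paths]
    # Each run is already sorted by the same key, so a k-way merge
    # produces one globally sorted stream.
    for count, query in heapq.merge(*runs, key=lambda t: (-t[0], t[1])):
        yield query, count
```

Calling `sort_queries_by_frequency(["q0.txt", "q1.txt"])` (hypothetical file names) yields `(query, count)` pairs lazily. With inputs at the 10 GB scale one would probably raise `n_buckets` so that each bucket's distinct queries fit comfortably in memory, and write the sorted runs to disk before merging.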

Crawling and analyzing data on Taobao best sellers
adbmal/sortquery

KDF5000/RqFetchData

#Crawler monitoring
pc10201/markets_monitor

#Study notes
yoghurtjia/github-pages

zzbkszd/github-pages
